Compiling queries for high-performance computing

Authors

Brandon Myers

Mark Oskin

Bill Howe

Abstract

Data-intensive applications motivate the integration of high-productivity query languages with high-performance computing runtimes. We present a technique, compiled parallel pipelines (CPP), for compiling relational query plans to programs suitable for high-performance computing platforms. Rather than compose a sequential query compiler with a high-performance communication library like MPI, we take a holistic approach that leverages the capabilities of parallel languages. For each pipeline in the query plan, CPP generates a parallel partitioned global address space (PGAS) program. This approach affords modular design, and it allows the compiler to reason about whole pipelines that include parallelism and communication. Using PGAS to efficiently execute queries requires designing efficient shared data structures, generating code that avoids extra messages, and mitigating the overhead of an execution model based on fine-grained tasks. We implement our technique as a system called RADISH. Our evaluation shows that CPP is 5.5× faster than compiled iterators on TPC-H queries. To show that RADISH is a practical system for in-memory analytics, we also compare the performance of RADISH on TPC-H with the MPP system DBX and find it to be competitive. Our work takes important first steps toward integrating query processing and distributed HPC.
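As a rough illustration of the pipeline idea described in the abstract, the following Python sketch fuses a scan, a selection, a projection, and a shuffle into one loop per data partition, then aggregates per destination. It is not RADISH's generated code, which targets a PGAS runtime; the relation `ROWS`, the predicate `qty < 40`, and the helper names `partition`, `pipeline_task`, and `run_query` are all hypothetical, and the partitions are processed sequentially here as a stand-in for parallel tasks.

```python
from collections import defaultdict

# Hypothetical toy relation: (order_key, quantity, price).
ROWS = [(1, 5, 10.0), (2, 50, 3.0), (1, 30, 2.0), (3, 7, 1.5), (2, 12, 4.0)]


def partition(rows, n):
    """Round-robin split, standing in for data spread across n workers."""
    return [rows[i::n] for i in range(n)]


def pipeline_task(part, n):
    """One fused pipeline over a partition: scan, select, project, shuffle.

    All operators run inside a single loop, so no per-operator iterator
    objects or intermediate tuple buffers sit between them.
    """
    outboxes = [[] for _ in range(n)]  # one outgoing buffer per destination
    for key, qty, price in part:       # scan
        if qty < 40:                   # select (assumed predicate)
            revenue = qty * price      # project
            outboxes[key % n].append((key, revenue))  # shuffle by key
    return outboxes


def run_query(rows, n=2):
    """Run the pipeline on every partition, then aggregate per destination."""
    sent = [pipeline_task(p, n) for p in partition(rows, n)]
    totals = {}
    for dest in range(n):
        local = defaultdict(float)     # each destination owns disjoint keys
        for outboxes in sent:
            for key, revenue in outboxes[dest]:
                local[key] += revenue
        totals.update(local)
    return totals
```

Calling `run_query(ROWS, 2)` computes the per-key revenue sums for the rows that pass the filter. Because tuples are routed by `key % n`, each destination aggregates a disjoint set of keys, so the final merge needs no further coordination.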

Similar resources

SESOS: A Verifiable Searchable Outsourcing Scheme for Ordered Structured Data in Cloud Computing

While cloud computing is growing at a remarkable speed, privacy issues are far from being solved. One way to diminish privacy concerns is to store data on the cloud in encrypted form. However, encryption often hinders useful computation by cloud services. A theoretical approach is to employ so-called fully homomorphic encryption, yet the overhead is so high that it is not considered a viable s...

DBToaster: A SQL Compiler for High-Performance Delta Processing in Main-Memory Databases

We present DBToaster, a novel query compilation framework for producing high performance compiled query executors that incrementally and continuously answer standing aggregate queries using in-memory views. DBToaster targets applications that require efficient main-memory processing of standing queries (views) fed by high-volume data streams, recursively compiling view maintenance (VM) queries ...

Green Energy-aware task scheduling using the DVFS technique in Cloud Computing

Nowadays, energy consumption has become a critical issue in high-performance distributed computing systems, so green computing tries to reduce energy consumption, carbon footprint, and CO2 emissions in high-performance computing systems (HPCs) such as clusters, grids, and clouds that run a large number of parallel applications. Reducing energy consumption for high end computing can bring various benefits such as red...

Compiling Matlab for High Performance Computing via X10 (Sable Technical Report)

Matlab is a popular dynamic array-based language commonly used by students, scientists and engineers, who appreciate the interactive development style, the rich set of array operators, the extensive built-in library, and the fact that they do not have to declare static types. Even though these users like to program in Matlab, their computations are often very compute-intensive and are better suit...

Fast Query Evaluation with (Lazy) Control Flow Compilation

Learning algorithms such as decision tree learners dynamically generate a huge amount of large queries. Because these queries are executed often, the trade-off between meta-calling and compiling & running them has been in favor of the latter, as compiled code is faster. This paper presents a technique named control flow compilation, which improves the compilation time of the queries by an order...

Publication date: 2016